CSTParser – a multi-document discourse parser

Authors

  • Erick Galani Maziero
  • Thiago Alexandre Salgueiro Pardo
Abstract

This paper presents the CSTParser, a multi-document discourse parser. Based on machine learning techniques and hand-crafted rules, the system identifies a set of relations predicted by CST (Cross-document Structure Theory) among sentences of different texts on the same topic.

Similar resources

Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis

We propose a novel approach for developing a two-stage document-level discourse parser. Our parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intrasentential parsing and the other for multisentential parsing. We present two approaches to combine these two stages of discourse parsing effectively. A set of...

Exploiting Event Semantics to Parse the Rhetorical Structure of Natural Language Text

Previous work on discourse parsing has mostly relied on surface syntactic and lexical features; the use of semantics is limited to shallow semantics. The goal of this thesis is to exploit event semantics in order to build discourse parse trees (DPT) based on informational rhetorical relations. Our work employs an Inductive Logic Programming (ILP) based rhetorical relation classifier, a Neural N...

Dependency-based Discourse Parser for Single-Document Summarization

The current state-of-the-art single-document summarization method generates a summary by solving a Tree Knapsack Problem (TKP), which is the problem of finding the optimal rooted subtree of the dependency-based discourse tree (DEP-DT) of a document. We can obtain a gold DEP-DT by transforming a gold Rhetorical Structure Theory-based discourse tree (RST-DT). However, there is still a large differ...

Improving a Pipeline Architecture for Shallow Discourse Parsing

We present a system that implements an end-to-end discourse parser. The system uses a pipeline architecture with seven stages: preprocessing, recognizing explicit connectives, identifying argument positions, identifying and labeling arguments, classifying explicit and implicit connectives, and identifying attribution structures. The discourse structure of a document is inferred based on these c...

Text Parsing of a Complex Genre

A text parsing component designed to be part of a system that assists students in academic reading and writing is presented. The parser can automatically add a relational discourse structure annotation to a scientific article that a user wants to explore. The discourse structure employed is defined in an XML format and is based on Rhetorical Structure Theory. The architecture of the parser comp...

Publication date: 2012